Rapid Development of New Language Pairs at SYSTRAN
Abstract
Promoters of purely statistical MT systems often claim that a new language pair can be built overnight, whereas developing a new language pair for a proficient rule-based translator requires a great deal of effort in describing linguistic rules and resources. We are therefore interested in rapid development techniques for rule-based systems. In this paper, we present the work conducted at SYSTRAN during the past year, in which we successfully developed 12 new language pairs. This led us to shift some architectural paradigms in our translators, to extend our implementations with the notion of Linguistic Families, to open our interfaces to more readable formats, and to design ways of working with fully multisource/multitarget aligned dictionaries in order to save time, particularly in the coding effort.
Related resources
New Generation Systran Translation System
In this paper, we present the design of the new generation Systran translation systems, currently utilized in the development of English-Hungarian, English-Polish, English-Arabic, French-Arabic, Hungarian-French and Polish-French language pairs. The new design, based on the traditional Systran machine translation expertise and the existing linguistic resources, addresses the following aspects: ...
Research on machine translation at the University of Saarbrücken
Research in the field of automatic analysis of language and machine translation has a long tradition at the University of Saarbrücken. In the late 1950s, a first attempt was made at the Institute of Applied Mathematics to develop a system for the automatic translation of Latin sentences (taken from a secondary school textbook) into German. In the early 1960s, a small group of researchers and st...
SYSTRAN MT Dictionary Development
SYSTRAN has demonstrated success in the MT field with its long history spanning nearly 30 years. As a general-purpose fully automatic MT system, SYSTRAN employs a transfer approach. Among its several components, large, carefully encoded, high-quality dictionaries are critical to SYSTRAN's translation capability. A total of over 2.4 million words and expressions are now encoded in the dictionari...
[Terminologie et Traduction, no. 1, 1986] Current Systran Developments at the EC Commission
In 1976, when the Commission first started to develop Systran for the English-French language pair, a great deal of scepticism was expressed from a wide variety of circles. Most of the translators working on the original development team left in desperation after two or three months while potential users who had the opportunity to see the raw output in those early days almost invariably ridicule...
Evaluation of a Machine Translation System for Low Resource Languages: METIS-II
In this paper we describe the METIS-II system and its evaluation on each of the language pairs: Dutch, German, Greek, and Spanish to English. The METIS-II system envisaged developing a data-driven approach in which no parallel corpus is required and in which no full parser or extensive rule sets are needed. We describe the evaluation on a development test set and on a test set taken from Europa...